In this paper we discuss the theory, performance and implications of
a class of Syntactic Pattern Recognition algorithms which are optimal in a
well defined sense.

Assume that it is desired to transmit a message consisting of a
sequence of symbols from a finite alphabet. Suppose further that
any such message will be a well formed sentence in a language generated by
a known formal grammar. The message is to be encoded and transmitted by
sending a sequence of complex signals, one signal for each symbol in the
message, through a noisy channel.

The corrupted message is decoded in two stages. First, the individual
symbols are identified by a maximum a posteriori probability (MAP) decision
rule. The resulting string of symbols will not, in general, be a
grammatically correct sentence. Thus, in stage two, a parser finds the
sentence in the language that maximizes the product of the individual symbol
probabilities conditioned on the signals received at the output of the
channel. The decoding thus obtained is the maximum likelihood
estimator of the transmitted message.
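
The two stages can be illustrated for the simplest case, a regular language given as a deterministic finite automaton. The following is a minimal sketch under stated assumptions and is not the procedure derived in the paper: the function names `map_symbols` and `best_sentence`, the example language a*b*, and the posterior probabilities are all hypothetical. Stage two is written as a dynamic program over automaton states that keeps, for each state, the most probable prefix reaching it.

```python
import math

def map_symbols(posteriors):
    """Stage one: pick the most probable symbol at each position independently."""
    return [max(p, key=p.get) for p in posteriors]

def best_sentence(posteriors, delta, start, accepting):
    """Stage two: most probable accepted string with one symbol per signal.

    posteriors[t][a] is P(symbol a | signal t); delta[(state, a)] is the next state.
    Returns the string as a list of symbols, or None if no sentence fits.
    """
    # best[q] = (log-probability, prefix) of the best prefix that ends in state q
    best = {start: (0.0, [])}
    for p in posteriors:
        nxt = {}
        for q, (score, prefix) in best.items():
            for a, pa in p.items():
                if pa <= 0.0 or (q, a) not in delta:
                    continue  # impossible symbol or no transition
                r = delta[(q, a)]
                cand = (score + math.log(pa), prefix + [a])
                if r not in nxt or cand[0] > nxt[r][0]:
                    nxt[r] = cand
        best = nxt
    finals = [v for q, v in best.items() if q in accepting]
    return max(finals, key=lambda v: v[0])[1] if finals else None

# Hypothetical example: the language a*b* (all a's before all b's).
delta = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
posteriors = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.4, "b": 0.6},
    {"a": 0.7, "b": 0.3},
    {"a": 0.2, "b": 0.8},
]
stage_one = map_symbols(posteriors)                    # not a sentence of a*b*
stage_two = best_sentence(posteriors, delta, 0, {0, 1})
```

Here stage one yields the string abab, which is not in the language, while stage two returns aaab, the sentence whose product of symbol probabilities (0.2016) is largest. Storing whole prefixes keeps the sketch short; back-pointers would avoid the copying.
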
Viterbi [1] treats the above described problem as one of estimating
the state sequence of a finite Markov process. The desired estimator is
obtained by a dynamic programming technique. Recently, Fung and Fu [2,3]
have described an algorithm for the case of messages which are sentences
in a context free language. Their procedure is based on an algorithm
given by Younger [4]. We have derived efficient recursive procedures
which solve the problem for regular, one-counter and context free languages.
The space and time complexity of these algorithms in terms of n, the length
of the input, is summarized in the table below.
In addition to the analysis, we have tested these algorithms for
several formal grammars using both real and simulated channels. Because
of the efficiency of these algorithms the tests were conducted on several
thousands of sentences. Some of the test grammars had over 300 production
rules. The tests were accomplished without special programming considerations.
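
For the context free case, the same maximization can be sketched as a max-product variant of the familiar table-filling recognizer for grammars in Chomsky normal form. This is an illustration under assumptions, not the authors' procedure nor that of Fung and Fu: the function name `cyk_best`, the rule encoding, the balanced-parenthesis grammar and the probabilities are hypothetical.

```python
import math

def cyk_best(posteriors, unary, binary, start="S"):
    """Return (log-probability, sentence) of the best sentence, or None.

    unary:  list of (A, a)     for rules A -> a
    binary: list of (A, B, C)  for rules A -> B C
    """
    n = len(posteriors)
    if n == 0:
        return None
    # table[i][j][A] = best (score, string) for A deriving positions i..j inclusive
    table = [[{} for _ in range(n)] for _ in range(n)]
    for i, p in enumerate(posteriors):
        for A, a in unary:
            pa = p.get(a, 0.0)
            if pa > 0.0:
                cand = (math.log(pa), [a])
                if A not in table[i][i] or cand[0] > table[i][i][A][0]:
                    table[i][i][A] = cand
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            cell = table[i][j]
            for k in range(i, j):  # split point: left is i..k, right is k+1..j
                left, right = table[i][k], table[k + 1][j]
                for A, B, C in binary:
                    if B in left and C in right:
                        cand = (left[B][0] + right[C][0], left[B][1] + right[C][1])
                        if A not in cell or cand[0] > cell[A][0]:
                            cell[A] = cand
    return table[0][n - 1].get(start)

# Hypothetical grammar for balanced parentheses in Chomsky normal form:
#   S -> S S | L R | L T,   T -> S R,   L -> "(",   R -> ")"
unary = [("L", "("), ("R", ")")]
binary = [("S", "S", "S"), ("S", "L", "R"), ("S", "L", "T"), ("T", "S", "R")]
paren_posteriors = [
    {"(": 0.9, ")": 0.1},
    {"(": 0.55, ")": 0.45},
    {"(": 0.6, ")": 0.4},
    {"(": 0.2, ")": 0.8},
]
result = cyk_best(paren_posteriors, unary, binary)
```

Symbol-by-symbol decoding of these posteriors gives the unbalanced string "((()". The sketch instead returns "()()" with probability product 0.1944, which beats the only other balanced candidate of this length, "(())", at 0.1584.
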

In the course of our experiments with the algorithms, we discovered
an empirical measure of the information content of formal languages. By
making the signal-to-noise ratio (SNR) of the channel very low, we can reduce
the performance of the single symbol decoder to the extent that it makes a
random choice for each symbol independent of the input to the channel.
Although the symbol accuracy of the MAP parser also degrades with decreasing
SNR, its limiting value is greater than that of the single symbol decoder
alone. The difference between the two limiting values is a measure of the
information encoded in the grammar.

For a given grammar G and the language L(G) it generates, we define

    L_n = \{ w \in L(G) : |w| = n \}

    V_n = \{ w \in V_T^* : |w| = n \}

Then we define the entropy H(L(G)) of the language L(G) by:

    [equation not recoverable from the source]
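
The sizes of the two sets defined above can be computed directly when the language is regular. The sketch below is a hypothetical illustration: it assumes V_T denotes the terminal alphabet, so that the number of length-n strings over it is |V_T|^n, and it counts the length-n sentences of a deterministic finite automaton by dynamic programming. The name `count_sentences` and the example language a*b* are assumptions, and the sketch does not attempt to evaluate the entropy itself.

```python
def count_sentences(delta, start, accepting, n):
    """Number of length-n strings accepted by the DFA, i.e. the size of L_n."""
    counts = {start: 1}  # counts[q] = number of prefixes that lead to state q
    for _ in range(n):
        nxt = {}
        for q, c in counts.items():
            for (s, _symbol), r in delta.items():
                if s == q:
                    nxt[r] = nxt.get(r, 0) + c
        counts = nxt
    return sum(c for q, c in counts.items() if q in accepting)

# Hypothetical example: a*b* over the terminal alphabet {a, b}.
ab_delta = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
alphabet_size = 2
n = 4
size_L_n = count_sentences(ab_delta, 0, {0, 1}, n)  # sentences of length n
size_V_n = alphabet_size ** n                       # all strings of length n
```

For this language only 5 of the 16 strings of length 4 are sentences, which gives a sense of how strongly the grammar restricts the set of possible messages.
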

We have observed that for two grammars G_1 and G_2, the entropies
H(L(G_1)) and H(L(G_2)) are in the same order as the differences between
the limiting values for their single symbol decoding and MAP parsing
accuracies.

Forney [5] has listed several important problems of the type described
here and has suggested that they be solved by the Viterbi Algorithm.
Because this algorithm has an exponential running time it is intractable
for long inputs. Forney further suggests that secondary information be
used to guide a heuristic search. Such backtracking procedures have been
used by Neely and White [6], Walker [7] and Levinson [8] in speech recognition
algorithms.

We have observed that by applying the appropriate one of our algorithms
to a variety of pattern recognition problems, both the high cost of the
Viterbi algorithm and the obvious disadvantages of sub-optimal backtracking
procedures can be avoided and the optimal solution still obtained.